Detecting “bad” regression models: multicriteria fitness functions in regression analysis
Abstract
Regression models that fit well but have no predictive ability are sometimes chance correlations and often show pathological features such as multicollinearity, overfitting, and the inclusion of noisy or spurious variables. This problem is well known and of the utmost importance. The present paper proposes criteria that must be fulfilled as conditions for model acceptability, with the aim of recognizing pathological linear regression models. These criteria are designed to address the following problems:
• model instability due to outliers and influential objects;
• predictor multicollinearity;
• redundancy in explanatory variables;
• overfitting due to chance factors.
A multicriteria fitness function based on maximizing the Q² statistic subject to a set of tests is proposed. This fitness function can also be used in model searching by variable selection approaches to obtain a final optimal population of models. Computations on the Selwood data set are reported to illustrate the use of this multicriteria fitness function in model searching. © 2003 Elsevier B.V. All rights reserved.
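As a rough illustration of maximizing Q² under a set of acceptance tests, the sketch below computes the leave-one-out Q² of an ordinary least-squares model and returns it as the fitness only if two simple checks pass. The checks (a cap on the R² minus Q² gap and a cap on pairwise predictor correlation), the thresholds `max_gap` and `max_corr`, and all function names are assumptions chosen for illustration; they are stand-ins, not the tests defined in the paper.

```python
import numpy as np

def r2_and_q2(X, y):
    """Return (R^2, leave-one-out Q^2) for an OLS model with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    H = A @ np.linalg.pinv(A)                   # hat matrix
    resid = y - H @ y                           # ordinary residuals
    # For OLS the leave-one-out residual is e_i / (1 - h_ii);
    # this assumes no object has leverage exactly 1.
    loo_resid = resid / (1.0 - np.diag(H))
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / tss
    q2 = 1.0 - np.sum(loo_resid ** 2) / tss     # 1 - PRESS / TSS
    return r2, q2

def fitness(X, y, max_gap=0.3, max_corr=0.95):
    """Q^2 if the model passes both illustrative tests, else -inf.

    max_gap and max_corr are hypothetical thresholds, not values from the paper.
    """
    X = np.asarray(X, dtype=float)
    if X.shape[1] > 1:
        corr = np.corrcoef(X, rowvar=False)
        off_diag = np.abs(corr[~np.eye(X.shape[1], dtype=bool)])
        if off_diag.max() > max_corr:           # crude multicollinearity test
            return -np.inf
    r2, q2 = r2_and_q2(X, y)
    if r2 - q2 > max_gap:                       # crude overfitting test
        return -np.inf
    return q2
```

In a variable selection search, each candidate subset of predictors would be scored with `fitness`, so that rejected models can never outrank accepted ones regardless of how well they fit.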
Similar resources
Penalized Estimators in Cox Regression Model
Proportional hazards Cox regression models play a key role in analyzing censored survival data. We use penalized methods in high-dimensional scenarios to achieve more efficient models. This article reviews penalized Cox regression for some frequently used penalty functions. Analysis of the medical data set "mgus2" confirms that penalized Cox regression performs better than the cox regressi...
Fitting pedotransfer functions using fuzzy regression
Pedotransfer functions are predictive models of a certain soil property from other easily, routinely, or cheaply measured properties. The common approach to fitting pedotransfer functions is conventional statistical regression. Such an approach relies heavily on crisp observations and crisp relations among variables. In the modeling of natural systems, ...
Analysis of Test Day Milk Yield by Random Regression Models and Evaluation of Persistency in Iranian Dairy Cows
Variance/covariance components of 227118 first-lactation test-day milk yield records belonging to 31258 Iranian Holstein cows were estimated using nine random regression models. Afterwards, different measures of persistency based on estimated breeding values were evaluated. Three functions were used to adjust the fixed lactation curve: Ali and Schaeffer (AS), quadratic (LE3) and cubic (LE4) order of...
Estimation of Industrial Production Costs, Using Regression Analysis, Neural Networks or Hybrid Neural-Regression Method?
Estimation (forecasting) of industrial production costs is one of the most important factors affecting decisions in highly competitive markets. Thus, accurate estimation is highly desirable. The Hybrid Regression Neural Network is an approach proposed in this paper to obtain a better fit than Regression Analysis and Neural Network methods. Comparing the estimated resul...